[BugFix] Fix hermes tool parser output error stream arguments in some cases (#10395) #10398

xiyuan-lee · 2024-11-16T18:50:21Z

[BugFix] Fix hermes tool parser output error stream arguments in some cases (#10395)

Use delta_text to return directly to the user, preventing errors caused by partial_json_parser during the incremental computation of parameters.

FIX #9693
FIX #9908
FIX #10395

github-actions · 2024-11-16T18:50:34Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

Add ready label to the PR
Enable auto-merge.

🚀

DarkLight1337 · 2024-11-17T06:26:14Z

cc @K-Mistele

xiyuan-lee · 2024-11-17T17:07:56Z

The function partial_json_parser will complete the partrial JSON string {"name": "tool_name", "arguments": {"arg1": " to {"name": "tool_name", "arguments": {"arg1": ""}}, which will result in an error when calculating the incremental changes in the arguments field.

githebs · 2024-11-17T17:16:30Z

Thanks everyone for making this potential fix!
I'll try on my own examples as soon as tomorrow
In the spirit of testing on the same basis, is it possible to have a minimal a very minimal python codebase requesting vllm with tools and streaming ?

xiyuan-lee · 2024-11-17T17:19:34Z

Thanks everyone for making this potential fix! I'll try on my own examples as soon as tomorrow In the spirit of testing on the same basis, is it possible to have a minimal a very minimal python codebase requesting vllm with tools and streaming ?

For the relevant code, please refer to the following link: #10395

githebs · 2024-11-17T17:21:55Z

Thanks everyone for making this potential fix! I'll try on my own examples as soon as tomorrow In the spirit of testing on the same basis, is it possible to have a minimal a very minimal python codebase requesting vllm with tools and streaming ?

For the relevant code, please refer to the following link: #10395

Totally missed that, thanks a lot!

DarkLight1337 · 2024-11-19T05:43:11Z

Please fix the linter errors.

DarkLight1337 · 2024-11-19T08:28:30Z

Don't worry about DCO, we can pass it manually if you agree to it.

… cases (vllm-project#10395) Signed-off-by: xiyuan lee <[email protected]>

xiyuan-lee · 2024-11-19T08:56:52Z

Please fix the linter errors.

I've fixed the linter errors.

DarkLight1337

Thanks for your patience!

… cases (vllm-project#10395) (vllm-project#10398) Signed-off-by: xiyuan lee <[email protected]>

… cases (vllm-project#10395) (vllm-project#10398) Signed-off-by: xiyuan lee <[email protected]> Signed-off-by: Maxime Fournioux <[email protected]>

… cases (vllm-project#10395) (vllm-project#10398) Signed-off-by: xiyuan lee <[email protected]> Signed-off-by: rickyx <[email protected]>

Sala8888 · 2024-11-22T11:09:47Z

I used the hermes_tool_parser.py as tool-parser-plugin and registered the parser as hermes_patched, but still have the same problem.

Traceback (most recent call last):
  File "/app/hermes_tool_parser.py", line 228, in extract_tool_calls_streaming
    function_name: Union[str, None] = current_tool_call.get("name")
                                      ^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'get'
Error trying to handle streaming tool call.
Traceback (most recent call last):
  File "/app/hermes_tool_parser.py", line 292, in extract_tool_calls_streaming
    args_delta_start_loc = cur_arguments_json.index(delta_text) \
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: substring not found

Here is how I start vllm service:

python3 -m vllm.entrypoints.openai.api_server \
--model /app/Qwen2.5-72B-Instruct-AWQ \
--port 7415 \
--tensor-parallel-size 2 \
--gpu-memory-utilization 0.95 \
--max-model-len 64000 \
--enforce-eager \
--disable_custom_all_reduce \
--enable-auto-tool-choice \
--tool-parser-plugin /app/hermes_tool_parser.py \
--tool-call-parser hermes_patched  \
--chat-template /app/qwen.jinja

… cases (vllm-project#10395) (vllm-project#10398) Signed-off-by: xiyuan lee <[email protected]> Signed-off-by: Tyler Michael Smith <[email protected]>

xiyuan-lee · 2024-11-23T10:14:52Z

I used the hermes_tool_parser.py as tool-parser-plugin and registered the parser as hermes_patched, but still have the same problem.

Traceback (most recent call last):
  File "/app/hermes_tool_parser.py", line 228, in extract_tool_calls_streaming
    function_name: Union[str, None] = current_tool_call.get("name")
                                      ^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'get'
Error trying to handle streaming tool call.
Traceback (most recent call last):
  File "/app/hermes_tool_parser.py", line 292, in extract_tool_calls_streaming
    args_delta_start_loc = cur_arguments_json.index(delta_text) \
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: substring not found

Here is how I start vllm service:

python3 -m vllm.entrypoints.openai.api_server \
--model /app/Qwen2.5-72B-Instruct-AWQ \
--port 7415 \
--tensor-parallel-size 2 \
--gpu-memory-utilization 0.95 \
--max-model-len 64000 \
--enforce-eager \
--disable_custom_all_reduce \
--enable-auto-tool-choice \
--tool-parser-plugin /app/hermes_tool_parser.py \
--tool-call-parser hermes_patched  \
--chat-template /app/qwen.jinja

I use vllm0.6.3.post1 docker compose, and entrypoint is:

entrypoint: ["/bin/sh", "-c", "python3 -u -m vllm.entrypoints.openai.api_server --model /data/models/Qwen2.5-72B-Instruct-AWQ --enable-auto-tool-choice --tool-call-parser hermes --tensor-parallel-size 2 --gpu_memory_utilization 0.97 --max_model_len 20000 --max_num_seq 40"])

… cases (vllm-project#10395) (vllm-project#10398) Signed-off-by: xiyuan lee <[email protected]>

mergify bot added the frontend label Nov 16, 2024

K-Mistele approved these changes Nov 17, 2024

View reviewed changes

K-Mistele mentioned this pull request Nov 17, 2024

[Bugfix] Hermes tool parser fails to check for & handle None values in some cases #9908

Closed

xiyuan-lee force-pushed the main branch from 83966cb to bcdeb9c Compare November 19, 2024 08:28

xiyuan-lee force-pushed the main branch from bcdeb9c to ee296e7 Compare November 19, 2024 08:41

[BugFix] Fix hermes tool parser output error stream arguments in some…

d3b94a7

… cases (vllm-project#10395) Signed-off-by: xiyuan lee <[email protected]>

xiyuan-lee force-pushed the main branch from ee296e7 to d3b94a7 Compare November 19, 2024 08:49

xiyuan-lee closed this Nov 19, 2024

xiyuan-lee reopened this Nov 19, 2024

DarkLight1337 approved these changes Nov 19, 2024

View reviewed changes

DarkLight1337 enabled auto-merge (squash) November 19, 2024 10:24

github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 19, 2024

DarkLight1337 merged commit f028dff into vllm-project:main Nov 19, 2024
62 checks passed

coolkp pushed a commit to coolkp/vllm that referenced this pull request Nov 20, 2024

[BugFix] Fix hermes tool parser output error stream arguments in some…

0c62b94

… cases (vllm-project#10395) (vllm-project#10398) Signed-off-by: xiyuan lee <[email protected]>

KuntaiDu pushed a commit to KuntaiDu/vllm that referenced this pull request Nov 20, 2024

[BugFix] Fix hermes tool parser output error stream arguments in some…

dc41dec

… cases (vllm-project#10395) (vllm-project#10398) Signed-off-by: xiyuan lee <[email protected]>

Sala8888 mentioned this pull request Nov 23, 2024

[Bug] Streaming output error of tool calling has still not been resolved. #10589

Closed

sleepwalker2017 pushed a commit to sleepwalker2017/vllm that referenced this pull request Dec 13, 2024

[BugFix] Fix hermes tool parser output error stream arguments in some…

93dfd65

… cases (vllm-project#10395) (vllm-project#10398) Signed-off-by: xiyuan lee <[email protected]>

marcelodiaz558 mentioned this pull request Dec 18, 2024

[Bug]: Invalid tool arguments generated in v0.6.5 #11279

Open

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BugFix] Fix hermes tool parser output error stream arguments in some cases (#10395) #10398

[BugFix] Fix hermes tool parser output error stream arguments in some cases (#10395) #10398

xiyuan-lee commented Nov 16, 2024 •

edited by github-actions bot

Loading

github-actions bot commented Nov 16, 2024

DarkLight1337 commented Nov 17, 2024

xiyuan-lee commented Nov 17, 2024 •

edited

Loading

githebs commented Nov 17, 2024

xiyuan-lee commented Nov 17, 2024

githebs commented Nov 17, 2024

DarkLight1337 commented Nov 19, 2024

DarkLight1337 commented Nov 19, 2024 •

edited

Loading

xiyuan-lee commented Nov 19, 2024

DarkLight1337 left a comment

Sala8888 commented Nov 22, 2024 •

edited

Loading

xiyuan-lee commented Nov 23, 2024 •

edited

Loading

[BugFix] Fix hermes tool parser output error stream arguments in some cases (#10395) #10398

[BugFix] Fix hermes tool parser output error stream arguments in some cases (#10395) #10398

Conversation

xiyuan-lee commented Nov 16, 2024 • edited by github-actions bot Loading

github-actions bot commented Nov 16, 2024

DarkLight1337 commented Nov 17, 2024

xiyuan-lee commented Nov 17, 2024 • edited Loading

githebs commented Nov 17, 2024

xiyuan-lee commented Nov 17, 2024

githebs commented Nov 17, 2024

DarkLight1337 commented Nov 19, 2024

DarkLight1337 commented Nov 19, 2024 • edited Loading

xiyuan-lee commented Nov 19, 2024

DarkLight1337 left a comment

Choose a reason for hiding this comment

Sala8888 commented Nov 22, 2024 • edited Loading

xiyuan-lee commented Nov 23, 2024 • edited Loading

xiyuan-lee commented Nov 16, 2024 •

edited by github-actions bot

Loading

xiyuan-lee commented Nov 17, 2024 •

edited

Loading

DarkLight1337 commented Nov 19, 2024 •

edited

Loading

Sala8888 commented Nov 22, 2024 •

edited

Loading

xiyuan-lee commented Nov 23, 2024 •

edited

Loading